Multivariate random forests

Authors

  • Mark R. Segal
  • Yuanyuan Xiao
Abstract

Random forests have emerged as a versatile and highly accurate classification and regression methodology, requiring little tuning and providing interpretable outputs. Here, we briefly outline the genesis of, and motivation for, the random forest paradigm as an outgrowth from earlier tree-structured techniques. We elaborate on aspects of prediction error and attendant tuning parameter issues. However, our emphasis is on extending the random forest schema to the multiple response setting. We provide a simple illustrative example from ecology that showcases the improved fit and enhanced interpretation afforded by the random forest framework. © 2011 John Wiley & Sons, Inc. WIREs Data Mining Knowl Discov 2011 1 80–87 DOI: 10.1002/widm.12

Similar resources

On the asymptotics of random forests

The last decade has witnessed a growing interest in random forest models which are recognized to exhibit good practical performance, especially in high-dimensional settings. On the theoretical side, however, their predictive power remains largely unexplained, thereby creating a gap between theory and practice. The aim of this paper is twofold. Firstly, we provide theoretical guarantees to link ...

A Copula Based Approach for Design of Multivariate Random Forests for Drug Sensitivity Prediction

Modeling sensitivity to drugs based on genetic characterizations is a significant challenge in the area of systems medicine. Ensemble based approaches such as Random Forests have been shown to perform well in both individual sensitivity prediction studies and team science based prediction challenges. However, Random Forests generate a deterministic predictive model for each drug based on the ge...

An introduction to geostatistics with R/gstat

7 Feature-space modelling; 7.1 Theory of linear models; 7.1.1 * Least-squares solution of the linear model; 7.2 Continuous response, continuous predictor; 7.3 Continuous response, categorical predictor; 7.4 * Multivariate linear models; 7....

Signal Enhancement Using Multivariate Classification Techniques and Physical Constraints

We report on an empirical comparison of several multivariate classification techniques (e.g., random forests, Bayesian classification, support vector machines) for signal identification; our experiments use K* mass as a test case. We show 1) the effect of using different cost matrices in generalization performance and 2) how information about physical constraints obtained from kinematic fitting...

Functional Data Classification with Kernel-Induced Random Forests

Scientists and others today often collect samples of curves and other functional data. Multivariate data classification methods cannot be directly used for functional data classification because of the curse of dimensionality and the difficulty of taking into account the correlation and order of functional data. We extend the kernel-induced random forest method for discriminating functional data by ...

Journal title:
  • Wiley Interdisciplinary Reviews: Data Mining and Knowledge Discovery

Volume 1

Pages 80–87

Publication date 2011